Sequence Learning
نویسندگان
چکیده
منابع مشابه
Convolutional Sequence to Sequence Learning
A. Weight Initialization We derive a weight initialization scheme tailored to the GLU activation function similar to Glorot & Bengio (2010); He et al. (2015b) by focusing on the variance of activations within the network for both forward and backward passes. We also detail how we modify the weight initialization for dropout. A.1. Forward Pass Assuming that the inputs x l of a convolutional laye...
متن کاملConvolutional Sequence to Sequence Learning
The prevalent approach to sequence to sequence learning maps an input sequence to a variable length output sequence via recurrent neural networks. We introduce an architecture based entirely on convolutional neural networks.1 Compared to recurrent models, computations over all elements can be fully parallelized during training and optimization is easier since the number of non-linearities is fi...
متن کاملUnsupervised Pretraining for Sequence to Sequence Learning
This work presents a general unsupervised learning method to improve the accuracy of sequence to sequence (seq2seq) models. In our method, the weights of the encoder and decoder of a seq2seq model are initialized with the pretrained weights of two language models and then fine-tuned with labeled data. We apply this method to challenging benchmarks in machine translation and abstractive summariz...
متن کاملMulti-task Sequence to Sequence Learning
Sequence to sequence learning has recently emerged as a new paradigm in supervised learning. To date, most of its applications focused on only one task and not much work explored this framework for multiple tasks. This paper examines three settings to multi-task sequence to sequence learning: (a) the one-to-many setting – where the encoder is shared between several tasks such as machine transla...
متن کاملSequence to Sequence Learning in Neural Network
Neural Network Elements. Deep learning is the name we use for “stacked neural networks”; that is, networks composed of several layers. The layers are made of nodes. A node is just a place where computation happens, loosely patterned on a neuronin the human brain, which fires when it encounters sufficient stimuli. Deep Neural Networks (DNNs) are powerful models that have achieved excellent perfo...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Neuron
سال: 2003
ISSN: 0896-6273
DOI: 10.1016/s0896-6273(03)00159-4